What is claimed is: 

1. A method of displaying text information corresponding to a speech 
portion of audio signals of a television program as a closed caption on a video display 
device, the method comprising the steps of: 

decoding the audio signals of the television program; 

filtering the audio signals to extract the speech portion; 

parsing the speech portion into discrete speech components in accordance with 
a speech model and grouping the parsed speech components; 

identifying words in a database corresponding to the grouped speech 
components; and 

converting the identified words into text data for display on the display device 
as the closed caption. 
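
The grouping, word-identification and text-conversion steps of claim 1 can be illustrated with a minimal sketch. It assumes the parsed speech components are phoneme symbols and that the database maps phoneme groups to words; the names `WORD_DB`, `match_words` and `to_caption_text`, the sample entries, and the greedy longest-match strategy are all hypothetical choices made for illustration and are not recited in the claims.

```python
# Hypothetical pronunciation database: phoneme groups -> words.
# The entries are illustrative only; the claims do not specify a format.
WORD_DB = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
    ("AH",): "a",
}

# Longest phoneme group stored in the database.
MAX_GROUP = max(len(key) for key in WORD_DB)


def match_words(phonemes):
    """Group phonemes greedily (longest match first) and look up each group."""
    words = []
    i = 0
    while i < len(phonemes):
        for size in range(min(MAX_GROUP, len(phonemes) - i), 0, -1):
            group = tuple(phonemes[i:i + size])
            if group in WORD_DB:
                words.append(WORD_DB[group])
                i += size
                break
        else:
            # No database entry starts at this phoneme; skip it and continue.
            i += 1
    return words


def to_caption_text(words):
    """Convert the identified words into text data for display."""
    return " ".join(words)
```

A call such as `to_caption_text(match_words(phonemes))` would then yield the text data handed to the display step. Greedy matching is only one possible grouping strategy; a practical recognizer would more likely score competing groupings.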

2. A method according to claim 1, wherein the step of filtering the audio 
signals is performed concurrently with the step of decoding later-occurring audio signals 
of the television program and the step of parsing earlier-occurring speech signals of the 
television program. 
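
The concurrent operation recited in claim 2 can be pictured as a three-stage pipeline in which each stage works on a different part of the signal at the same time. The sketch below is one possible arrangement, assuming the signal arrives as a sequence of frames and that each step is available as a callable; `run_stage`, `run_pipeline` and the thread-and-queue design are hypothetical and are not taken from the claims.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the frame stream


def run_stage(func, inbox, outbox):
    """Apply func to each item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def run_pipeline(frames, decode, filter_speech, parse):
    """Run decode, filter and parse in separate threads so they overlap."""
    q_in, q_dec, q_flt, q_out = (queue.Queue() for _ in range(4))
    stages = [
        threading.Thread(target=run_stage, args=(decode, q_in, q_dec)),
        threading.Thread(target=run_stage, args=(filter_speech, q_dec, q_flt)),
        threading.Thread(target=run_stage, args=(parse, q_flt, q_out)),
    ]
    for stage in stages:
        stage.start()
    for frame in frames:
        q_in.put(frame)
    q_in.put(_DONE)
    results = []
    while True:
        item = q_out.get()
        if item is _DONE:
            break
        results.append(item)
    for stage in stages:
        stage.join()
    return results
```

While the parsing stage handles an earlier frame, the filtering stage can process the next one and the decoding stage a later one still. Each stage is a single thread reading a first-in, first-out queue, so the output order matches the input order. The sketch omits error handling: a stage callable that raises would stall the pipeline.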

3. A method according to claim 1, wherein the step of parsing the speech 
portion into discrete speech components includes the step of employing a speaker 
independent model to provide individual words as the parsed speech components. 

4. A method according to claim 1 further including the step of formatting 
the text data into lines of text data for display in a closed caption area of the display device. 
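
The formatting step of claim 4 can be sketched with Python's standard `textwrap` module. The function name `format_caption_lines`, the default of 32 characters per line and the two-line caption area are assumptions chosen for illustration; the claim does not specify any dimensions.

```python
import textwrap


def format_caption_lines(text, width=32, max_lines=2):
    """Break caption text into lines, then group the lines into pages
    that each fit a caption area of max_lines rows."""
    lines = textwrap.wrap(text, width=width)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]
```

Each inner list holds the lines shown together in the caption area; successive lists would be displayed one after another as the speech proceeds.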

5. A method according to claim 1, wherein the step of parsing the speech 
portion into discrete speech components includes the step of employing a speaker dependent 
model to provide phonemes as the parsed speech components. 

6. A method according to claim 5, wherein the speaker dependent model 
employs a hidden Markov model and the method further comprises the steps of: 

receiving a training text as a part of the television signal, the training text 
corresponding to a part of the speech portion of the audio signals; 

updating the hidden Markov model based on the training text and the part of 
the speech portion of the audio signals corresponding to the training text; and 

applying the updated hidden Markov model to parse the speech portion of the 
audio signals to provide the phonemes. 

7. A method of displaying text information corresponding to a speech portion 
of audio signals of a television program as a closed caption on a video display device, the 
method comprising the steps of: 

decoding the audio signals of the television program; 

filtering the audio signals to extract the speech portion; 

receiving a training text as a part of the television signal, the training text 
corresponding to a part of the speech portion of the audio signals; 

generating a hidden Markov model from the training text and the part of the 
speech portion of the audio signals; 

parsing the audio speech signals into phonemes based on the generated hidden 
Markov model; 

identifying words in a database corresponding to grouped phonemes; and 

converting the identified words into text data for presentation on the display of 
the audio/visual device as closed captioned textual data. 

8. A method according to claim 7, wherein the step of filtering the audio 
signals is performed concurrently with the step of decoding later-occurring audio signals 
of the television program and the step of parsing earlier-occurring speech signals of the 
television program. 

9. A method according to claim 7 further including the step of formatting 
the text data into lines of text data for display in a closed caption area of the display device. 

10. A method according to claim 7, further comprising the step of 
providing respective audio speech signals and training texts for each speaker of a plurality of 
speakers on the television program. 

11. Apparatus for displaying text information corresponding to a speech 
portion of audio signals of a television program as a closed caption on a video display 
device, the apparatus comprising: 

a decoder which separates the audio signals from the television program 
signals; 

a speech filter which identifies portions of the audio signals that include speech 
components and separates the identified speech component signals from the audio signals; 

a phoneme generator which parses the speech portion into phonemes in 
accordance with a speech model; 

a database of words, each word being identified as corresponding to a discrete 
set of phonemes; 

a word matcher which groups the phonemes provided by the phoneme 
generator and identifies words in the database corresponding to the grouped phonemes; and 



a formatting processor that converts the identified words into text data for 
display on the display device as the closed caption. 

12. Apparatus according to claim 11, wherein the speech filter, the decoder 
and the phoneme generator are configured to operate in parallel. 

13. Apparatus according to claim 11, wherein the phoneme generator 
includes a speaker independent speech recognition system. 

14. Apparatus according to claim 11, wherein the phoneme generator 
includes a speaker dependent speech recognition system. 

15. Apparatus according to claim 14, wherein the speech model includes a 
hidden Markov model and the phoneme generator further comprises: 

means for receiving a training text as a part of the television signal, the 
training text corresponding to a part of the speech portion of the audio signals; 

means for updating the hidden Markov model based on the training text and 
the part of the speech portion of the audio signals corresponding to the training text; and 

means for applying the updated hidden Markov model to parse the speech 
portion of the audio signals to provide the phonemes. 

16. A computer readable carrier including computer program instructions 
that cause a computer to implement a method for displaying text information corresponding 
to a speech portion of audio signals of a television program as a closed caption on a video 
display device, the method comprising the steps of: 

decoding the audio signals of the television program; 

filtering the audio signals to extract the speech portion; 

parsing the speech portion into discrete speech components in accordance with 
a speech model and grouping the parsed speech components; 

identifying words in a database corresponding to the grouped speech 
components; and 

converting the identified words into text data for display on the display device 
as the closed caption. 

17. A computer readable carrier according to claim 16, wherein the 
computer program instructions that cause the computer to perform the step of filtering the 
audio signals are configured to control the computer concurrently with the computer program 
instructions that cause the computer to perform the step of decoding the audio signals of the 
television program and with the computer program instructions that cause the computer to 
perform the step of parsing the speech signals of the television program. 

18. A computer readable carrier according to claim 16, wherein the 
computer program instructions that cause the computer to perform the step of parsing the 
speech portion into discrete speech components include computer program instructions that 
cause the computer to use a speaker independent model to provide individual words as the 
parsed speech components. 

19. A computer readable carrier according to claim 16 further including 
computer program instructions that cause the computer to format the text data into lines of 
text data for display in a closed caption area of the display device. 



20. A computer readable carrier according to claim 16, wherein the computer 
program instructions that cause the computer to perform the step of parsing the speech portion 
into discrete speech components include computer program instructions that cause the 
computer to use a speaker dependent model to provide phonemes as the parsed speech 
components. 



